Human and machine consonant recognition

Authors

  • Jason J. Sroka
  • Louis D. Braida
Abstract

Three traditional ASR parameterizations matched with Hidden Markov Models (HMMs) are compared to humans for speaker-dependent consonant recognition using nonsense syllables degraded by highpass filtering, lowpass filtering, or additive noise. Confusion matrices were determined by recognizing the syllables using different ASR front ends, including Mel-Filter Bank (MFB) energies, Mel-Filtered Cepstral Coefficients (MFCCs), and the Ensemble Interval Histogram (EIH). In general, the MFB recognition accuracy was slightly higher than the MFCC, which was higher than the EIH. For syllables degraded by lowpass and highpass filtering, automated systems trained on the degraded condition recognized the consonants as well as humans. For syllables degraded by additive speech-shaped noise, none of the automated systems recognized consonants as well as humans. The greatest advantage displayed by humans was in determining the correct voiced/unvoiced classification of consonants in noise.

© 2005 Elsevier B.V. All rights reserved. PACS: 43.71.E; 43.71.G; 43.72.D; 43.72.N

Similar resources

The Interspeech 2008 consonant challenge

Listeners outperform automatic speech recognition systems at every level, including the very basic level of consonant identification. What is not clear is where the human advantage originates. Does the fault lie in the acoustic representations of speech or in the recognizer architecture, or in a lack of compatibility between the two? Many insights can be gained by carrying out a detailed human-...

Predicting consonant intelligibility in normal-hearing listeners using microscopic models with different distance measures in an automatic speech recognizer

This study investigates recognition rates of consonants in vowel-consonant-vowel syllables, both in hearing tests and in two microscopic models. This syllable structure does not exist in Farsi or Azerbaijani, but because the goal is only to recognize the middle phoneme, hearing tests show that listeners can recognize the phonemes correctly in clean speech conditions....

Facial Expression Recognition Based on Anatomical Structure of Human Face

Automatic analysis of human facial expressions is one of the challenging problems in machine vision systems. It has many applications in human-computer interaction, such as social signal processing, social robots, deceit detection, interactive video and behavior monitoring. In this paper, we develop a new method for automatic facial expression recognition based on facial muscle anatomy and hum...

Two-stage Isolated Consonant-Vowel (CV) Unit Recognition in Indian Languages

This paper addresses the issues in the recognition of the most frequently occurring Consonant-Vowel (CV) units of speech in Indian languages. Two major issues in the recognition of CV units are the large number of CV classes and the high similarity among several CV units. In this paper, we propose two-level acoustic models comprising Hidden Markov Models (HMM) and Support Vector Machines (SVM) t...

State Space Point Distribution Parameter for Support Vector Machine Based CV Unit Classification

In this paper we extend Support Vector Machines (SVM) for speaker-independent Consonant-Vowel (CV) unit classification. Here we adopt the technique known as Decision Directed Acyclic Graph (DDAG), which is used to combine many two-class classifiers into a multiclass classifier. Using Reconstructed State Space (RSS) based State Space Point Distribution (SSPD) parameters, we obtain an average sp...

Perceptual compensation for the effects of reverberation on consonant identification: A comparison of human and machine performance

Human listeners are able to perceptually compensate for the effects of reverberation on speech recognition, by exploiting information gleaned from prior exposure to the reverberant environment. We present a computer model of perceptual compensation for reverberation implemented within a hidden Markov model speech recogniser, in which different reverberant speech models are selected depending on...

Journal:
  • Speech Communication

Volume 45

Publication date: 2005